Enable IVF index Save/Load functionality with dense clusters #260
Pull request overview
This PR implements save and load functionality for IVF indices using dense cluster storage, enabling serialization and restoration of both static and dynamic IVF indices without requiring the original dataset.
Changes:
- Added save/load support for `DenseClusteredDataset`, `IVFIndex`, and `DynamicIVFIndex`
- Implemented directory-based and stream-based serialization formats
- Added comprehensive test coverage across C++ unit tests, integration tests, and Python bindings
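To illustrate the stream-based round trip described above, here is a minimal Python sketch of one way dense clusters could be written to and read back from a binary stream. It does not reproduce the format this PR implements: the `MAGIC` tag, the header layout, and the names `save_clusters` and `load_clusters` are all hypothetical, and each cluster is modeled simply as a list of ids plus a flat list of float32 vector components.

```python
import io
import struct

MAGIC = b"DCD0"  # hypothetical format tag, not the one used by SVS


def save_clusters(stream, clusters, dimensions):
    """Write a header, then for each cluster: count, ids, flat vector data."""
    stream.write(MAGIC)
    stream.write(struct.pack("<QQ", len(clusters), dimensions))
    for ids, vectors in clusters:
        if len(vectors) != len(ids) * dimensions:
            raise ValueError("vector data does not match id count")
        stream.write(struct.pack("<Q", len(ids)))
        stream.write(struct.pack(f"<{len(ids)}Q", *ids))
        stream.write(struct.pack(f"<{len(vectors)}f", *vectors))


def _read_exact(stream, size):
    """Read exactly `size` bytes or fail, so truncated input is detected."""
    buf = stream.read(size)
    if len(buf) != size:
        raise EOFError("unexpected end of stream")
    return buf


def load_clusters(stream):
    """Rebuild the clusters from the stream alone, without the original dataset."""
    if _read_exact(stream, len(MAGIC)) != MAGIC:
        raise ValueError("not a dense-cluster stream")
    num_clusters, dimensions = struct.unpack("<QQ", _read_exact(stream, 16))
    clusters = []
    for _ in range(num_clusters):
        (count,) = struct.unpack("<Q", _read_exact(stream, 8))
        ids = list(struct.unpack(f"<{count}Q", _read_exact(stream, 8 * count)))
        n_values = count * dimensions
        vectors = list(struct.unpack(f"<{n_values}f", _read_exact(stream, 4 * n_values)))
        clusters.append((ids, vectors))
    return clusters, dimensions


# Round trip through an in-memory stream, including an empty cluster.
clusters = [([7, 3], [0.5, 1.0, 2.0, -4.0]), ([], []), ([42], [8.0, 0.25])]
buf = io.BytesIO()
save_clusters(buf, clusters, dimensions=2)
buf.seek(0)
restored, dims = load_clusters(buf)
```

Because each cluster stores its own ids and vector data, the loader needs nothing beyond the stream itself, which is the property the PR description highlights. A directory-based variant could write the same bytes to a file inside the index directory.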
Reviewed changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/svs/index/ivf/index.cpp | Added unit tests for IVF index save/load and DenseClusteredDataset serialization |
| tests/svs/index/ivf/dynamic_ivf.cpp | Added unit tests for DynamicIVF save/load functionality |
| tests/integration/ivf/index_search.cpp | Added integration tests for directory-based and stream-based IVF save/load |
| tests/integration/ivf/dynamic_scalar.cpp | Added integration tests for DynamicIVF save/load with recall verification |
| include/svs/orchestrators/ivf.h | Added save/load methods to IVF orchestrator interface and implementation |
| include/svs/orchestrators/dynamic_ivf.h | Added save/load methods to DynamicIVF orchestrator interface |
| include/svs/index/ivf/index.h | Implemented save method and load_ivf_index function for static IVF indices |
| include/svs/index/ivf/dynamic_ivf.h | Updated save method to use DenseClusteredDataset serialization and added load_dynamic_ivf_index function |
| include/svs/index/ivf/clustering.h | Implemented save/load methods for DenseClusteredDataset with binary serialization format |
| examples/python/example_ivf_dynamic.py | Updated example to demonstrate save/load functionality |
| examples/python/example_ivf.py | Added save/load demonstration to IVF example |
| bindings/python/tests/test_ivf.py | Added Python test for IVF save/load functionality |
| bindings/python/tests/test_dynamic_ivf.py | New file with comprehensive DynamicIVF tests including save/load |
| bindings/python/src/ivf.cpp | Added Python bindings for IVF save/load methods |
| bindings/python/src/dynamic_ivf.cpp | Added Python bindings for DynamicIVF load method |
include/svs/index/ivf/clustering.h
Outdated
    // Constructor for empty clusters (for assembly/dynamic operations)
    template <typename Alloc>
    DenseClusteredDataset(size_t num_clusters, size_t dimensions, const Alloc& allocator)
    // Note: This constructor creates empty clusters using the default allocator for Data
The documentation comment on line 346 incorrectly states this constructor uses the default allocator for Data. However, the constructor signature has been changed to remove the allocator parameter, so the comment should be updated to reflect this change.
Suggested change:

    - // Note: This constructor creates empty clusters using the default allocator for Data
    + // Note: This constructor creates empty clusters with the given dimensionality
ethanglaser left a comment
No objections from my end, nice work. There is a CI failure, though, in one of the IVF tests.